Fixing indexing regression and bug fixes for grouping criteria#20145
Fixing indexing regression and bug fixes for grouping criteria#20145Bukhtawar merged 1 commit intoopensearch-project:mainfrom
Conversation
📝 WalkthroughWalkthroughRemoved LookupMapLockAcquisitionException and its usages; refactored CompositeIndexWriter to track per-child pending docs and improve tragic-exception handling; changed writer APIs from Iterable to List; added no-op source-derivation overrides to ContextAwareGroupingFieldMapper; updated/added tests and test helpers; minor mapper, codec, and changelog edits. (50 words) Changes
Sequence Diagram(s)sequenceDiagram
actor IndexingRequest
participant CompositeWriter
participant IndexWriterLookup
participant ChildWriter
participant AccumulatingWriter
IndexingRequest->>CompositeWriter: addDocuments(docs: List, uid, size)
CompositeWriter->>CompositeWriter: increment childWriterPendingNumDocs(size)
CompositeWriter->>IndexWriterLookup: getIndexWriterForDoc(uid)
IndexWriterLookup-->>CompositeWriter: ChildWriter (criteria)
CompositeWriter->>ChildWriter: write addDocuments(docs)
ChildWriter-->>CompositeWriter: ack / exception
CompositeWriter->>AccumulatingWriter: maybe flush/merge accounting
CompositeWriter->>CompositeWriter: update pending counters (subtract on refresh if child moved to old)
CompositeWriter-->>IndexingRequest: return sequence/ram/flushing info or propagate tragic exception
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)
215-215: Reduced encapsulation ofmapReadLock.The visibility of
mapReadLockhas been changed fromprivate finalto package-private, allowing direct access from other classes in the same package. This field controls critical concurrency behavior, and exposing it directly increases the risk of misuse.Consider:
- Keeping the field
privateand exposing only necessary operations through methods (e.g.,tryAcquireLock()).- If package-private access is required for the retry logic, add clear documentation about proper usage patterns and thread-safety requirements.
- Restrict access using a package-private accessor method rather than exposing the field directly.
498-498: Simplify boolean comparisons.The condition uses explicit
== falseand== truecomparisons which are redundant in Java.Apply this diff:
-if (success == false && current != null && current.mapReadLock.isHeldByCurrentThread() == true) { +if (!success && current != null && current.mapReadLock.isHeldByCurrentThread()) { current.mapReadLock.close(); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java(1 hunks)server/src/main/java/org/opensearch/index/IndexSettings.java(1 hunks)server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java(10 hunks)server/src/main/java/org/opensearch/index/mapper/MapperService.java(2 hunks)server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java(0 hunks)server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java(4 hunks)server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java(1 hunks)
💤 Files with no reviewable changes (1)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: gradle-check
🔇 Additional comments (13)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)
87-87: LGTM!Import correctly added to support the stream operations in
getCompositeFieldTypes().
693-697: Verify the behavior change scope and call frequency.The filtering to return only
StarTreeMapper.StarTreeFieldTypeinstances represents a narrowed scope from returning all composite field types. Confirm this change is intentional and whether any callers expect otherCompositeMappedFieldTypeimplementations. Additionally, verify the call frequency of this method; if invoked on hot paths, consider caching the filtered result to avoid repeated stream collection operations.server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
106-106: LGTM!The test constant is appropriately set to a lower value (20) than the production default (100) for faster test execution while still being within the valid range (5-500).
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)
44-46: LGTM!Mockito imports correctly added to support the new verification test.
71-77: LGTM!Method call correctly updated to include
MAX_NUMBER_OF_RETRIESparameter, aligning with the new bounded retry API.
141-146: LGTM!Method call correctly updated with retry parameter.
197-202: LGTM!Method call correctly updated with retry parameter.
208-227: Test validates bounded retry semantics correctly.The test properly verifies:
LookupMapLockAcquisitionExceptionis thrown after exhausting retriestryAcquire()is called exactlyMAX_NUMBER_OF_RETRIEStimesOne consideration: the mock setup directly assigns to
map.currentandmap.current.mapReadLockwhich accesses package-private fields. This works for testing but creates tight coupling to internal implementation details.server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
724-753: Retry logic moved to lower layer - verify exception handling.The
LookupMapLockAcquisitionExceptionretry logic has been removed from bulk action handling and moved toCompositeIndexWriterwith bounded retries. This architectural approach places retry logic closer to where the exception originates.Ensure that when
LookupMapLockAcquisitionExceptionpropagates up after max retries are exhausted, it's properly handled and doesn't cause unexpected bulk operation failures.server/src/main/java/org/opensearch/index/IndexSettings.java (1)
499-506: Significant default value change - verify upgrade impact.The default retry count increased to 100 with a maximum of 500. Since this is a dynamic setting, existing indices will apply the new default upon upgrade. Consider whether this change should be documented in release notes for operators who have tuned their clusters based on previous defaults.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (3)
691-693: LGTM: Metrics gathering refactoring.The refactoring from stream-based iteration to explicit for-loops improves code clarity and performance for these simple aggregation operations. The logic is correct in all cases, with proper handling of both current and old maps where necessary, and appropriate locking in
ramBytesUsed().Also applies to: 702-704, 731-742, 758-770, 796-806
210-210: Verify removal offinalmodifier is intentional.The
finalmodifier has been removed fromCriteriaBasedIndexWriterLookup,CriteriaBasedWriterLock, andLiveIndexWriterDeletesMap. This allows subclassing of these internal implementation classes. Confirm whether:
- Subclassing is required for test mocking/stubbing.
- If so, consider restricting visibility to test scope or use sealed classes.
- If intentional for production extensibility, document extension points and invariants.
Also applies to: 301-301, 406-406
678-679: Verify retry configuration defaults and bounds.The
maxRetryOnLookupMapAcquisitionExceptionsetting controls retry behavior for lookup map acquisition. Without access to the codebase, I cannot confirm the specific default values, upper/lower bounds, or whether they align with industry best practices for lock acquisition retry mechanisms. Ensure the following:
- Default retry count is reasonable (typically 3–10 attempts for in-thread locks per best practices)
- Upper bounds prevent excessive retries that could cause performance degradation
- Configuration is documented and tunable per workload expectations
- Retry strategy includes backoff/jitter to avoid retry storms (not just blind spinning)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
475-489: Add exponential backoff to retry loop.The retry loop performs a busy-wait by calling
tryAcquire()repeatedly without any delay between attempts. This pattern wastes CPU cycles under contention when the lock is frequently unavailable.As suggested in the previous review, add exponential backoff between retry attempts with
Thread.sleep(Math.min(1L << counter, 100L))and handleInterruptedExceptionby restoring the interrupt flag and propagatingLookupMapLockAcquisitionException.
🧹 Nitpick comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
483-483: Simplify boolean comparisons.The expressions
current.isClosed() == true(line 483) andcurrent.mapReadLock.isHeldByCurrentThread() == true(line 498) contain redundant boolean comparisons.Apply this diff to simplify:
- if (current != null && current.isClosed() == true) { + if (current != null && current.isClosed()) {- if (success == false && current != null && current.mapReadLock.isHeldByCurrentThread() == true) { + if (!success && current != null && current.mapReadLock.isHeldByCurrentThread()) {Also applies to: 498-498
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java(1 hunks)server/src/main/java/org/opensearch/index/IndexSettings.java(1 hunks)server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java(10 hunks)server/src/main/java/org/opensearch/index/mapper/MapperService.java(2 hunks)server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java(0 hunks)server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java(4 hunks)server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java(1 hunks)
💤 Files with no reviewable changes (1)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (5)
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- CHANGELOG.md
- server/src/main/java/org/opensearch/index/IndexSettings.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
🔇 Additional comments (6)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-731: Behavior change: retries now only on version conflicts inonCompleteThe retry path in
onCompletenow exclusively handlesVersionConflictEngineException(viaisConflictException) bounded byretryOnConflict, with the previousLookupMapLockAcquisitionException-specific retry branch removed. That seems consistent with moving lock-acquisition retry logic down intoCompositeIndexWriterand keeps this layer from depending on engine-internal exception types.Please double-check that:
LookupMapLockAcquisitionExceptionis fully retried/consumed withinCompositeIndexWriter(or below), and- it does not bubble back up to this point expecting a retry here anymore,
so we don’t inadvertently change failure behavior for callers that previously relied on the bulk action to transparently retry on transient lookup-map lock contention.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (5)
210-210: Visibility changes support the new retry mechanism.The removal of
finalmodifiers from inner classes and the change ofmapReadLockto package-private access are necessary to support the refactored retry logic. The package-private access onmapReadLock(line 215) enablesLiveIndexWriterDeletesMap.computeIndexWriterIfAbsentForCriteriato calltryAcquire()directly at line 482.Also applies to: 215-215, 301-301, 406-406, 408-408
466-471: Proper parameter propagation for configurable retry limit.The addition of
maxRetryOnLookupMapAcquisitionExceptionparameter enables configurable retry behavior, and the value is correctly sourced fromengineConfig.getIndexSettings().Also applies to: 675-680
691-693: Simplified iteration improves readability.The refactoring from stream-based iteration to direct iteration over
criteriaBasedIndexWriterMap.values()is cleaner and avoids unnecessary intermediate operations.Also applies to: 702-704
731-750: Proper tragic exception checking across all writers.The iteration through both current and old writer maps to detect tragic exceptions is thorough and correctly checks if writers are closed before accessing their tragic exception state.
753-774: Correct synchronization and state checks.The
ramBytesUsed()method properly acquires write locks before iterating, and therollback()method correctly checks if writers are open before attempting rollback operations.Also applies to: 794-811
|
❌ Gradle check result for 4b247a0: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
44-46: Retry wiring in tests looks correct; consider simplifying the mocking for readability.
- Updating all
computeIndexWriterIfAbsentForCriteriainvocations to passMAX_NUMBER_OF_RETRIESkeeps the tests consistent with the new API and the retry semantics; this looks correct.testMaxRetryCountWhenWriteLockDuringIndexingcorrectly verifies thattryAcquire()is invoked exactlyMAX_NUMBER_OF_RETRIEStimes when the lock is never obtained, and theLookupMapLockAcquisitionExceptionis thrown as expected.As a minor test ergonomics tweak, you could stub and verify directly on
writerLockinstead of going throughmap.current.mapReadLockin thewhen(...)andverify(...)calls. That would make the test a bit less coupled to the internal layout ofLiveIndexWriterDeletesMapandCriteriaBasedIndexWriterLookupwhile preserving the behavior being asserted.Also applies to: 72-77, 141-146, 197-202, 208-227
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
210-215: Visibility and mutability changes for nested types are acceptable but could use an explicit “for testing” annotation.Making
CriteriaBasedIndexWriterLookupandCriteriaBasedWriterLockmore visible, and relaxingmapReadLockandLiveIndexWriterDeletesMap.currentfromfinal, is understandable to support the new tests that need to mock and override these internals.To keep the public surface area tidy and signal intent, consider adding an explicit
@opensearch.internal(or similar) Javadoc tag or comment on these nested types/fields indicating that they are exposed primarily for testing. That helps discourage external production code from depending on them and makes future refactors easier.Also applies to: 301-301, 406-412
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java(1 hunks)server/src/main/java/org/opensearch/index/IndexSettings.java(1 hunks)server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java(10 hunks)server/src/main/java/org/opensearch/index/mapper/MapperService.java(2 hunks)server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java(0 hunks)server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java(4 hunks)server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java(1 hunks)
💤 Files with no reviewable changes (1)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (4)
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/IndexSettings.java
- CHANGELOG.md
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: Analyze (java)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: detect-breaking-change
🔇 Additional comments (3)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)
87-87: LGTM!The import is necessary for the
Collectors.toSet()operation used in thegetCompositeFieldTypes()method below.
694-696: Verify initialization safety, caller expectations, and method naming alignment.The method now filters to return only
StarTreeMapper.StarTreeFieldTypeinstances fromcompositeMappedFieldTypes:
Potential NPE risk: Verify that
getCompositeFieldTypes()is never called beforeinternalMerge()initializescompositeMappedFieldTypesat line 552. If called during early initialization phases,.stream()could fail on a null reference.Semantic narrowing: Confirm whether the method name
getCompositeFieldTypes()still accurately reflects its behavior. If other composite field type implementations exist or may be added, consider renaming togetStarTreeFieldTypes()or updating documentation to clarify the filtering behavior.Performance: If
getCompositeFieldTypes()is called frequently in hot paths, consider caching the filtered result to avoid recreating the set on each invocation.server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
691-705: Iteration refactors over current/old writer maps look correct and improve clarity.
getFlushingBytes()andgetPendingNumDocs()now iterate directly overliveIndexWriterDeletesMap.current.criteriaBasedIndexWriterMap.values(), summing per-child metrics before adding the accumulating writer’s values. This preserves behavior and is straightforward.getTragicException()now checks bothcurrentandoldchild writers for a tragic exception before falling back to the accumulating writer, which ensures group-specific failures are surfaced.ramBytesUsed()androllback()explicitly iterate over bothcurrentandoldwriters, and the use ofmapWriteLock.acquire()around theramBytesUsed()traversals is appropriate for a consistent snapshot.Overall, these loops are clear and consistent with the data structures being used; no issues from a correctness or concurrency standpoint.
Also applies to: 731-742, 757-772, 796-805
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
Show resolved
Hide resolved
|
❌ Gradle check result for dbcae67: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
server/src/main/java/org/opensearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In @CHANGELOG.md:
- Line 34: Update the PR reference display text to include a leading '#' so it
matches other entries; change the link text
"([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" to
"(#[20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" in the
CHANGELOG.md entry so the PR is rendered as #20145 while keeping the same URL.
In
@server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java:
- Around line 905-941: In testRollbackWithWriterOnOld, avoid an unbounded join
and a silent catch: change refresher.join() to refresher.join(10000) (or another
reasonable timeout) to prevent CI hangs, and update the try/catch around
compositeIndexWriter.addDocuments(operation.docs(), operation.uid()) to either
remove the empty catch, assert the expected Error (if failure is intentional),
or rethrow/log it with a clear comment describing the expected behavior; ensure
the test documents the intent when swallowing the Error so future readers know
whether failure is expected.
🧹 Nitpick comments (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
401-424: Test name doesn’t match behavior.
testGetTragicExceptionWithExceptiondoesn’t inject any exception and assertsnull; consider renaming to reflect “no tragic exception in normal path”.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
CHANGELOG.mdserver/src/main/java/org/opensearch/OpenSearchServerException.javaserver/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.javaserver/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.javaserver/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.javaserver/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.javaserver/src/main/java/org/opensearch/index/mapper/MapperService.javaserver/src/test/java/org/opensearch/ExceptionSerializationTests.javaserver/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.javaserver/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.javaserver/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.javatest/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (5)
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
🧰 Additional context used
🧬 Code graph analysis (4)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
- GitHub Check: Mend Security Check
🔇 Additional comments (14)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
724-730: Clean removal of LookupMapLockAcquisitionException retry logic.The retry handling for
LookupMapLockAcquisitionExceptionhas been correctly removed along with the exception class deletion. The remaining retry logic for version conflicts remains intact.server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (1)
186-199: Appropriate no-op overrides for source derivation.The empty implementations are correct for a field mapper that is not part of an ingested document. The base class methods are properly overridden to exclude this field from source derivation validation and generation.
Minor note: The Javadoc mentions "Context Aware Segment" but this appears to be about the "Context Aware Grouping field" - consider clarifying the terminology.
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)
243-257: Clean convenience method for test configuration.Good addition of a simplified
config(Store store)overload that delegates to the full method with sensible defaults.
509-564: Well-structured test utility for flushing behavior and tragic exception testing.The
FlushingIndexWriterFactoryprovides good control over directory injection and forces flushes after each write operation, which is useful for testing edge cases.Note on directory ownership: When
useFailingDirectorySupplierisfalse, the passed-indirectoryis added to the internaldirectorieslist and will be closed by this factory'sclose()method. Ensure test callers don't attempt to close these directories separately to avoid double-close issues.server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (8)
131-144: Good documentation for pending docs tracking.The Javadoc clearly explains the purpose of
childWriterPendingNumDocsand acknowledges the temporary overshoot during refresh. UsingAtomicLongensures thread-safe updates.
352-357: Proper defensive check for closed lookup.Good addition to release the lock and return
nullif the lookup is closed after acquiring the lock. This prevents operations on stale/closed lookups during map rotation.
494-501: Improved loop handles map rotation during acquisition.The while loop now correctly continues trying to acquire a read lock on the current map until it successfully gets one that isn't closed. The comment explains the rationale well.
548-556: Correct pending docs accounting during refresh.The logic properly subtracts the old child writer's pending docs count after syncing to the parent writer, preventing double counting.
715-738: Robust handling of closed writers with tragic exception propagation.The utility method correctly handles
AlreadyClosedExceptionby checking for tragic exceptions. If a tragic exception exists, it's properly rethrown; otherwise, closed writers are safely skipped.
1054-1068: Verify:childWriterPendingNumDocsincrement for parent writer operation.
deleteInLuceneoperates onaccumulatingIndexWriter(the parent writer), but Line 1067 incrementschildWriterPendingNumDocs. Per the documentation at lines 131-143,childWriterPendingNumDocsshould track pending docs for "child level IndexWriters" only.Since
accumulatingIndexWriter.getPendingNumDocs()will already count this document internally, the increment at line 1067 may cause double-counting ingetPendingNumDocs()(line 743:childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs()).Please verify if this increment is intentional or if it should be removed.
913-927: Correct pending docs tracking for addDocuments.The increment by
docs.size()properly accounts for all documents added in the batch operation.
851-866: Proper cleanup of child writers during rollback.Calling
rollback()on all child writers before the parent ensures no file leaks. The comment explains the rationale well.server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
16-17: New imports look fine.server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
40-40:Supplierimport is reasonable for the new directory factory patterns.
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
Show resolved
Hide resolved
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
Show resolved
Hide resolved
.../src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)
848-865: Potential resource leak: If one child writer's rollback throws, remaining writers are not rolled back.The current implementation iterates through child writers and calls
rollback()on each, but if any throws an exception, the subsequent writers won't be rolled back. This could leave resources in an inconsistent state.🔧 Suggested fix using exception aggregation
public void rollback() throws IOException { if (shouldClose()) { - // Though calling rollback on child level writer seems like a redundant thing, but this ensures all the - // child level IndexWriters are closed and there is no case of file leaks. Incase rollback throws any - // exception close the shard as rollback gets called in close flow. And rollback throwin exception means - // there is already leaks. - for (DisposableIndexWriter disposableIndexWriter : liveIndexWriterDeletesMap.current.criteriaBasedIndexWriterMap.values()) { - disposableIndexWriter.getIndexWriter().rollback(); - } - - for (DisposableIndexWriter disposableIndexWriter : liveIndexWriterDeletesMap.old.criteriaBasedIndexWriterMap.values()) { - disposableIndexWriter.getIndexWriter().rollback(); - } - - accumulatingIndexWriter.rollback(); + IOException firstException = null; + + for (DisposableIndexWriter disposableIndexWriter : liveIndexWriterDeletesMap.current.criteriaBasedIndexWriterMap.values()) { + try { + disposableIndexWriter.getIndexWriter().rollback(); + } catch (IOException e) { + if (firstException == null) { + firstException = e; + } else { + firstException.addSuppressed(e); + } + } + } + + for (DisposableIndexWriter disposableIndexWriter : liveIndexWriterDeletesMap.old.criteriaBasedIndexWriterMap.values()) { + try { + disposableIndexWriter.getIndexWriter().rollback(); + } catch (IOException e) { + if (firstException == null) { + firstException = e; + } else { + firstException.addSuppressed(e); + } + } + } + + try { + accumulatingIndexWriter.rollback(); + } catch (IOException e) { + if (firstException == null) { + firstException = e; + } else { + firstException.addSuppressed(e); + } + } + closed.set(true); + + if (firstException != null) { + throw firstException; + } } }Alternatively, leverage
IOUtils.close()which already handles exception aggregation (seelibs/common/src/main/java/org/opensearch/common/util/io/IOUtils.javalines 216-316).
1053-1067: Potential double-counting:childWriterPendingNumDocsis incremented for writes to the parent writer.The
deleteInLucenemethod writes toaccumulatingIndexWriter(the parent), but line 1066 incrementschildWriterPendingNumDocs. According to the comment on lines 130-143,childWriterPendingNumDocsshould only track "pendingNumDocs for child level IndexWriters."Since
getPendingNumDocs()returnschildWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs()(line 742), this causes double-counting: once inchildWriterPendingNumDocsand once inaccumulatingIndexWriter.getPendingNumDocs().🐛 Suggested fix
private void deleteInLucene( Term uid, boolean isStaleOperation, IndexWriter currentWriter, Iterable<? extends IndexableField> doc, Field... softDeletesField ) throws IOException { if (isStaleOperation) { currentWriter.addDocument(doc); } else { currentWriter.softUpdateDocument(uid, doc, softDeletesField); } - - childWriterPendingNumDocs.incrementAndGet(); }The parent writer's pending docs are already tracked via
accumulatingIndexWriter.getPendingNumDocs()in thegetPendingNumDocs()method.
🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 547-555: The loop is calling
childDisposableWriter.getIndexWriter().close() before calling
getPendingNumDocs(), which causes AlreadyClosedException; change the order
inside the CompositeIndexWriter.DisposableIndexWriter loop so you first call
long pendingNumDocsByOldChildWriter =
childDisposableWriter.getIndexWriter().getPendingNumDocs(), then close the
writer via childDisposableWriter.getIndexWriter().close(), then proceed with
accumulatingIndexWriter.addIndexes(directoryToCombine) and
IOUtils.closeWhileHandlingException(directoryToCombine), and finally update
childWriterPendingNumDocs.addAndGet(-pendingNumDocsByOldChildWriter).
In @server/src/main/java/org/opensearch/index/mapper/MapperService.java:
- Around line 693-697: getCompositeFieldTypes() can NPE because
compositeMappedFieldTypes may be null before internalMerge(); also the current
filter restricts results to CompositeDataCubeFieldType which changes the
original contract. Fix by guarding against null and returning an empty set when
compositeMappedFieldTypes is null, and remove the instanceof filter so you
return all CompositeMappedFieldType instances as originally expected; update
getCompositeFieldTypes() to check compositeMappedFieldTypes == null (or use
Optional/Collections.emptySet()) and then return
compositeMappedFieldTypes.stream().collect(...) without the
CompositeDataCubeFieldType filter; ensure callers like MetadataMappingService
still receive the full Set<CompositeMappedFieldType>.
In
@server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java:
- Line 163: The AtomicBoolean variable named "run" declared in
CompositeIndexWriterForUpdateAndDeletesTests is unused dead code; remove the
declaration "AtomicBoolean run = new AtomicBoolean(true);" and the later
assignment that sets it to false so no unused field remains (search for the
symbol "run" within CompositeIndexWriterForUpdateAndDeletesTests to remove both
occurrences).
🧹 Nitpick comments (4)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
153-168: Consider simplifying the test structure.The try-catch with
fail()pattern is non-idiomatic. Since the test method already declaresthrows IOException, exceptions will naturally fail the test. The current pattern also doesn't verify any actual behavior ofcanDeriveSource()orderiveSource().♻️ Suggested simplification
public void testContextAwareFieldMapperWithDerivedSource() throws IOException { ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null); ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper( "context_aware_grouping", fieldType, new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping") ); LeafReader leafReader = mock(LeafReader.class); - try { - mapper.canDeriveSource(); - mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0); - } catch (Exception e) { - fail(e.getMessage()); - } + // Verify no-op behavior doesn't throw + assertFalse(mapper.canDeriveSource()); // Or assertTrue if expected to return true + mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0); }Consider adding assertions on the expected return values or verifying that
deriveSourcebehaves as a no-op (e.g., checking the builder state is unchanged if that's the expected behavior).server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (2)
520-554: Consider extracting the failing directory supplier to reduce duplication.The
dirSupplierpattern creating aFilterDirectorythat throwsOutOfMemoryErroronwriteBytesis duplicated across multiple tests (testRAMBytesUsedWithTragicExceptionOnCurrent,testFlushingBytesWithTragicExceptionOnCurrent,testTragicExceptionGetWithTragicExceptionOnCurrent, and their "OnOld" variants).♻️ Suggested helper method
// Add at class level or in base test class private Supplier<Directory> createFailingDirectorySupplier() { return () -> new FilterDirectory(newDirectory()) { @Override public IndexOutput createOutput(String name, IOContext context) throws IOException { IndexOutput out = super.createOutput(name, context); return new FilterIndexOutput("failing output", "test", out) { @Override public void writeBytes(byte[] b, int offset, int length) throws IOException { throw new OutOfMemoryError("Simulated write failure"); } }; } }; }Then tests can simply call
createFailingDirectorySupplier()instead of repeating the anonymous class.
548-550: Consider adding a comment explaining why the error is swallowed.The
catch (Error ignored) {}pattern is used intentionally to trigger tragic exception scenarios, but an empty catch block makes it harder to understand the test intent or debug if unexpected errors occur.♻️ Suggested improvement
try { compositeIndexWriter.addDocuments(operation.docs(), operation.uid()); - } catch (Error ignored) {} + } catch (Error ignored) { + // Expected: OutOfMemoryError triggers tragic exception state + }server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
714-737: Logic is correct; consider extracting the iteration pattern to reduce duplication.The exception handling correctly:
- Catches
AlreadyClosedException(normal during refresh transitions)- Re-throws if a tragic exception exists (propagates fatal errors)
However, the same iteration and exception-handling pattern is duplicated for current and old maps here and in
ramBytesUsedUtil.♻️ Suggested helper to reduce duplication
private long sumFromWriters( Iterable<DisposableIndexWriter> writers, java.util.function.ToLongFunction<IndexWriter> valueExtractor ) { long total = 0; for (DisposableIndexWriter dw : writers) { try { total += valueExtractor.applyAsLong(dw.getIndexWriter()); } catch (AlreadyClosedException e) { if (dw.getIndexWriter().getTragicException() != null) { throw e; } } } return total; }Then both methods can use:
long flushingBytes = sumFromWriters(current.criteriaBasedIndexWriterMap.values(), IndexWriter::getFlushingBytes) + sumFromWriters(old.criteriaBasedIndexWriterMap.values(), IndexWriter::getFlushingBytes);
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
CHANGELOG.mdserver/src/main/java/org/opensearch/OpenSearchServerException.javaserver/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.javaserver/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.javaserver/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.javaserver/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.javaserver/src/main/java/org/opensearch/index/mapper/MapperService.javaserver/src/test/java/org/opensearch/ExceptionSerializationTests.javaserver/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.javaserver/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.javaserver/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.javatest/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (2)
- CHANGELOG.md
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
🧰 Additional context used
🧬 Code graph analysis (5)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (2)
server/src/main/java/org/opensearch/index/search/SimpleQueryStringQueryParser.java (1)
Settings(317-450)server/src/main/java/org/opensearch/script/ContextAwareGroupingScript.java (1)
ContextAwareGroupingScript(22-44)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
- GitHub Check: gradle-check
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, macos-15)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: detect-breaking-change
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: Analyze (java)
- GitHub Check: Mend Security Check
🔇 Additional comments (15)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-753: The removal ofLookupMapLockAcquisitionExceptionretry handling is incomplete.The codebase still contains orphaned references to the removed exception:
IndexSettings.java(lines 515-525) definesINDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTIONsetting and includes a JavaDoc comment referencing the removed exception class.These remnants should be removed or updated to reflect that the exception no longer exists.
Likely an incorrect or invalid review comment.
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
174-182: Verify: Delete operation not protected byacquireLock().Unlike other delete tests in this file (e.g., lines 39-49, 77-86, 123-132), this
deleteDocumentcall is not wrapped withcompositeIndexWriter.acquireLock(operation.uid().bytes()).Is this intentional to test concurrent access behavior, or is this an oversight? If intentional, a brief comment explaining the test's purpose would clarify the design choice.
server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)
198-206: LGTM!Good fix. Adding the explicit else branch ensures the codec is always set on the
IndexWriterConfig, preventing potential issues when context-aware is disabled. The explanatory comment provides helpful context about why the codec is set here rather than inCodecService.server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (1)
135-137: LGTM!The API change from
IterabletoList<ParseContext.Document>provides stronger typing and aligns with related changes inDocumentIndexWriterandCompositeIndexWriter.Minor observation: The
finalmodifier is applied todocsinaddDocuments(line 135) but not insoftUpdateDocuments(line 147). Consider making them consistent if desired, though this has no functional impact.Also applies to: 145-154
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)
243-257: LGTM!Clean convenience overload that reduces boilerplate in tests using a custom
Store.
509-564: Test utility looks good; minor note on directory ownership.The
FlushingIndexWriterFactoryis a useful test harness for validating flush behavior and tragic exception handling.One consideration: When
useFailingDirectorySupplier.get()returnsfalse, the passeddirectoryparameter is added todirectoriesand will be closed byclose(). If a caller passes a directory they expect to manage separately, this could cause double-close issues. This is likely acceptable for test code, but worth being aware of when using this factory.server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (3)
401-424: LGTM! Test verifies no tragic exception under normal operation.This test correctly validates that
getTragicException()returnsnullwhen indexing succeeds without errors.
556-605: LGTM! Proper concurrency testing for tragic exception handling during refresh.The test correctly sets up a scenario where:
- A tragic exception occurs on a child writer
- A refresh operation is initiated (moving the writer to "old" map)
- The test verifies
AlreadyClosedExceptionis thrown when accessingramBytesUsed()The use of
CountDownLatchfor synchronization andIOUtils.closeWhileHandlingExceptionfor cleanup follows best practices.
905-941: LGTM! Test validates rollback succeeds with writers in old map.The test correctly validates that
rollback()can complete successfully even when child writers exist in the old map during a refresh operation. The cleanup in thefinallyblock properly handles resources.server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (5)
130-143: LGTM! Well-documented addition ofchildWriterPendingNumDocstracking.The documentation clearly explains:
- When
childWriterPendingNumDocsis incremented (add/update operations on child writers)- The intentional trade-off of potential temporary overshooting during refresh
- Why undershooting would be problematic
This aligns with the PR objective of fixing indexing regression.
351-356: LGTM! Proper guard against operating on a closed lookup.The added check correctly handles the race condition where the lookup could be closed between acquiring the lock and performing operations. Releasing the lock and returning
nullallows the caller to retry with the current map.
912-927: LGTM! Correctly tracks pending documents for batch operations.The implementation:
- Uses the new
List<ParseContext.Document>signature- Correctly increments
childWriterPendingNumDocsbydocs.size()after successful indexing- Properly acquires both the map read lock and the keyed lock for the document
765-779: LGTM! Comprehensive tragic exception detection across all writers.The implementation correctly checks for tragic exceptions in:
- All current map child writers
- All old map child writers
- The accumulating parent writer
Returns the first tragic exception found, which is the expected behavior for detecting fatal errors.
601-604: LGTM! Test utility method with appropriate visibility.The method is package-private and clearly documented as being for unit tests. This allows tests to verify read-lock behavior without exposing internals publicly.
server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)
56-67: LGTM! API signature change fromIterabletoListenables pending doc tracking.The change from
Iterable<ParseContext.Document>toList<ParseContext.Document>is necessary to support thechildWriterPendingNumDocstracking inCompositeIndexWriter, which requiresdocs.size(). This is a sensible API tightening since callers typically have aListanyway.
.../src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
Show resolved
Hide resolved
|
❌ Gradle check result for 10948d0: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: RS146BIJAY <rishavsagar4b1@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (4)
850-865: Potential resource leak if rollback throws for any child writer.If
rollback()throws an exception for any child writer in the current or old map, subsequent child writers and the accumulating writer won't have theirrollback()called, potentially causing file/resource leaks.Consider using a pattern that ensures all rollbacks are attempted, similar to
IOUtils.closeWhileHandlingException:🐛 Proposed fix
if (shouldClose()) { // Though calling rollback on child level writer seems like a redundant thing, but this ensures all the // child level IndexWriters are closed and there is no case of file leaks. Incase rollback throws any // exception close the shard as rollback gets called in close flow. And rollback throwin exception means // there is already leaks. + Exception firstException = null; for (DisposableIndexWriter disposableIndexWriter : liveIndexWriterDeletesMap.current.criteriaBasedIndexWriterMap.values()) { - disposableIndexWriter.getIndexWriter().rollback(); + try { + disposableIndexWriter.getIndexWriter().rollback(); + } catch (Exception e) { + if (firstException == null) { + firstException = e; + } else { + firstException.addSuppressed(e); + } + } } for (DisposableIndexWriter disposableIndexWriter : liveIndexWriterDeletesMap.old.criteriaBasedIndexWriterMap.values()) { - disposableIndexWriter.getIndexWriter().rollback(); + try { + disposableIndexWriter.getIndexWriter().rollback(); + } catch (Exception e) { + if (firstException == null) { + firstException = e; + } else { + firstException.addSuppressed(e); + } + } } - accumulatingIndexWriter.rollback(); + try { + accumulatingIndexWriter.rollback(); + } catch (Exception e) { + if (firstException == null) { + firstException = e; + } else { + firstException.addSuppressed(e); + } + } closed.set(true); + if (firstException != null) { + throw new IOException("Failed to rollback one or more writers", firstException); + } }
912-927: Missing validation for empty document list.Line 916 calls
docs.iterator().next()without checking if the list is empty. If an empty list is passed, this will throwNoSuchElementException.Consider adding validation:
🐛 Proposed fix
public long addDocuments(final List<ParseContext.Document> docs, Term uid) throws IOException { // We obtain a read lock on a child level IndexWriter and then return it. Post Indexing completes, we close this // IndexWriter. ensureOpen(); + if (docs.isEmpty()) { + throw new IllegalArgumentException("Cannot add empty document list"); + } final String criteria = getGroupingCriteriaForDoc(docs.iterator().next());
946-970: Consistent withaddDocuments- same empty list validation concern.Line 955 also calls
docs.iterator().next()without checking if the list is empty. The same validation should be added here for consistency.🐛 Proposed fix
public void softUpdateDocuments( Term uid, List<ParseContext.Document> docs, long version, long seqNo, long primaryTerm, Field... softDeletesField ) throws IOException { ensureOpen(); + if (docs.isEmpty()) { + throw new IllegalArgumentException("Cannot update with empty document list"); + } final String criteria = getGroupingCriteriaForDoc(docs.iterator().next());
1053-1067: Incorrect counter increment:deleteInLuceneoperates on parent writer, not child writer.
deleteInLuceneis called withaccumulatingIndexWriter(line 1049), which is the parent writer. IncrementingchildWriterPendingNumDocshere is incorrect because:
- The
childWriterPendingNumDocsfield is specifically for tracking child writer pending docs.- The parent writer (
accumulatingIndexWriter) already tracks its own pending docs internally viagetPendingNumDocs().- This causes double counting in
getPendingNumDocs()(line 742) which addschildWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs().🐛 Proposed fix - remove the increment
if (isStaleOperation) { currentWriter.addDocument(doc); } else { currentWriter.softUpdateDocument(uid, doc, softDeletesField); } - - childWriterPendingNumDocs.incrementAndGet(); }
🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 547-555: The code calls
childDisposableWriter.getIndexWriter().getPendingNumDocs() after closing the
writer, which can return wrong values or throw; fix by retrieving pendingNumDocs
from the IndexWriter before closing it: for each
CompositeIndexWriter.DisposableIndexWriter in
markForRefreshIndexWritersMap.values(), obtain a local reference to its
IndexWriter, call getPendingNumDocs() and store the value, then get the
Directory, close the writer, addIndexes on accumulatingIndexWriter with the
Directory, close the Directory, and finally update childWriterPendingNumDocs
using the previously saved pending count; ensure you reference
DisposableIndexWriter, getIndexWriter(), getPendingNumDocs(),
accumulatingIndexWriter.addIndexes(...),
IOUtils.closeWhileHandlingException(...), and
childWriterPendingNumDocs.addAndGet(...) when making the change.
In
@server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java:
- Around line 187-199: The Javadoc uses inconsistent terminology ("Context Aware
Segment") for a class and content type named ContextAwareGroupingFieldMapper /
CONTENT_TYPE ("context_aware_grouping"); update the Javadoc on the
canDeriveSource and deriveSource methods to refer to "Context Aware Grouping"
(or match CONTENT_TYPE) so comments consistently reflect
ContextAwareGroupingFieldMapper and its content type; ensure both method
comments and any related class-level docs use the same "Context Aware Grouping"
phrasing.
In @server/src/main/java/org/opensearch/index/mapper/MapperService.java:
- Around line 694-696: The method getCompositeFieldTypes() can NPE because
compositeMappedFieldTypes is assigned in internalMerge() and may be null;
initialize compositeMappedFieldTypes at declaration to an empty Set (similar to
unmappedFieldTypes) or add a null-safe guard in getCompositeFieldTypes() to
return Collections.emptySet() when null, and for performance compute and cache
the filtered CompositeDataCubeFieldType subset during internalMerge() (e.g.,
assign both compositeMappedFieldTypes and a cached compositeDataCubeFieldTypes
Set there) so getCompositeFieldTypes() simply returns the cached set without
re-filtering.
🧹 Nitpick comments (3)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
163-186: UnusedAtomicBoolean runvariable.The
runvariable is initialized and set tofalseon line 185, but it's never read by therefresherthread. The thread unconditionally executesbeforeRefresh()and exits. This appears to be dead code, possibly a remnant from a different pattern used in other tests.🔧 Suggested cleanup
- AtomicBoolean run = new AtomicBoolean(true); Thread refresher = new Thread(() -> { latch.countDown(); try { compositeIndexWriter.beforeRefresh(); } catch (Exception ignored) {} }); refresher.start(); try { latch.await(); compositeIndexWriter.deleteDocument( operation.uid(), false, newDeleteTombstoneDoc(id), 1, 2, primaryTerm.get(), softDeletesField ); } finally { IOUtils.closeWhileHandlingException(lock.getMapReadLock()); - run.set(false); refresher.join();server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
401-424: Test name is misleading.The test
testGetTragicExceptionWithExceptionactually verifies thatgetTragicException()returnsnullwhen no tragic exception has occurred. The name suggests it tests the presence of an exception. Consider renaming for clarity.🔧 Suggested rename
- public void testGetTragicExceptionWithException() throws IOException { + public void testGetTragicExceptionReturnsNullWhenNoFailure() throws IOException {server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
493-500: Potential busy-wait spin loop.The
whileloop usingtryAcquire()will spin continuously if the lock is unavailable. SincetryAcquire()returnsnullboth when the lock cannot be acquired and when the lookup is closed, this could result in CPU-intensive busy waiting during contention.Consider adding a brief
Thread.yield()or usingtryAcquire(timeout)to reduce CPU overhead during contention:♻️ Suggested improvement
while (current == null || current.isClosed()) { // This function acquires a first read lock on a map which does not have any write lock present. Current keeps // on getting rotated during refresh, so there will be one current on which read lock can be obtained. // Validate that no write lock is applied on the map and the map is not closed. Idea here is write lock was // never applied on this map as write lock gets only during closing time. We are doing this instead of acquire, // because acquire can also apply a read lock in case refresh completed and map is closed. current = this.current.mapReadLock.tryAcquire(); + if (current == null) { + Thread.yield(); + } }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
CHANGELOG.mdserver/src/main/java/org/opensearch/OpenSearchServerException.javaserver/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.javaserver/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.javaserver/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.javaserver/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.javaserver/src/main/java/org/opensearch/index/mapper/MapperService.javaserver/src/test/java/org/opensearch/ExceptionSerializationTests.javaserver/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.javaserver/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.javaserver/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.javatest/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/OpenSearchServerException.java
🚧 Files skipped from review as they are similar to previous changes (3)
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- CHANGELOG.md
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
🧰 Additional context used
🧬 Code graph analysis (3)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: Mend Security Check
🔇 Additional comments (19)
server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)
198-206: Helpful documentation for the codec configuration rationale.The added comment clearly explains why the codec is set here instead of in
CodecService, and howassociatedCriteriaties into per-document criteria resolution. This aids future maintainability.Minor stylistic note: "initialisation" uses British spelling; consider "initialization" for consistency if the project follows American English conventions.
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (1)
11-12: LGTM!The imports are correctly added for the new
deriveSourcemethod signature.server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-753: LGTM!The removal of
LookupMapLockAcquisitionExceptionhandling is consistent with the PR objectives. The remaining retry logic for version conflicts is preserved correctly, and theonCompletemethod continues to handle update retries and logging appropriately.server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
146-196: Test logic for concurrent delete during refresh is sound.The test correctly validates delete behavior when a document exists in an old child writer during a concurrent refresh operation. The latch synchronization ensures proper ordering, and the assertions verify the expected state.
server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)
135-137: API change fromIterabletoListis compatible.The signature change maintains compatibility with Lucene's
IndexWriter.addDocuments(Iterable<?>)sinceListimplementsIterable. This change aligns with the interface updates inDocumentIndexWriterand enables callers to accesssize()when needed for the per-child-writer pending doc accounting introduced inCompositeIndexWriter.
145-154: LGTM!The
softUpdateDocumentsparameter type change fromIterabletoListis consistent with theaddDocumentschange and maintains Lucene API compatibility.server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)
520-554: LGTM!The test correctly validates that
ramBytesUsed()throwsAlreadyClosedExceptionwhen a tragic exception has occurred on the current writer. TheFlushingIndexWriterFactorypattern effectively simulates OOM failures.
556-605: LGTM!The test correctly validates tragic exception handling when the writer is in the "old" state during a concurrent refresh operation. The lock synchronization pattern ensures proper test ordering.
607-642: LGTM!The test validates that
ramBytesUsed()returns a positive value when the writer is in the "old" state without any tragic exception. Good counterpart to the tragic exception tests.
644-853: LGTM!The flushing bytes and tragic exception tests follow consistent patterns and provide good coverage for:
getFlushingBytes()behavior with tragic exceptions on current/old writersgetFlushingBytes()normal behavior with writer on old stategetTragicException()returning non-null when tragic exception occurred on current/old writersThe resource cleanup is handled properly across all tests.
905-941: LGTM!The test correctly validates that
rollback()succeeds when the writer is in the "old" state during a concurrent refresh. The synchronization pattern and cleanup are consistent with other tests in the class.server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)
56-67: Breaking API change:Iterable→Listfor document parameters.The signature changes from
Iterable<ParseContext.Document>toList<ParseContext.Document>for bothaddDocumentsandsoftUpdateDocumentsare breaking changes for any external implementations ofDocumentIndexWriter.This change is intentional to support per-child-writer pending doc accounting in
CompositeIndexWriter, which requires callingsize()on the documents collection. BothLuceneIndexWriterandCompositeIndexWriterhave been updated with the newListparameter, and all internal callers already passListtypes.server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (7)
130-143: LGTM!The
childWriterPendingNumDocsfield withAtomicLongis appropriate for thread-safe tracking of pending documents across child writers. The documentation clearly explains the increment scenarios and acknowledges that overshooting during refresh is acceptable since undershooting could be problematic.
351-356: LGTM!Defensive check after acquiring the lock ensures operations don't proceed on a closed lookup. The pattern of acquiring, checking, and releasing if invalid is correctly implemented.
601-604: LGTM!Package-private test helper method with clear documentation. Appropriate visibility for unit testing purposes.
714-737: LGTM!The utility method correctly handles
AlreadyClosedExceptionby only re-throwing when a tragic exception exists. The iteration over both current and old maps with the explained rationale for avoiding double-counting is sound.
765-779: LGTM!The iteration order (current → old → accumulating) for checking tragic exceptions is reasonable. Returning the first found exception is acceptable behavior.
805-828: LGTM!Consistent implementation pattern with
getFlushingBytesUtil. Proper handling ofAlreadyClosedExceptionwith tragic exception check.
1024-1047: LGTM!Correctly increments
childWriterPendingNumDocswhen adding delete entries to child writers (both current and old). The tombstone entries added viasoftUpdateDocumenton child writers should indeed be counted in the child pending docs.
Description
Fixing indexing regression and bug fixes for grouping criteria. For testing grouping criteria changes, enabled the grouping criteria on local and tested with setting criteria. Wil raise the changes for integ test enablement for CAS in a separate PR as that require decent changes in integ test as well.
Related Issues
#19919
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
Bug Fixes
Improvements
Tests
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.